Autonomous robotic surgery has advanced significantly through the analysis of visual and temporal cues in surgical workflows, but relational cues from domain knowledge remain underexplored. Complex relations in surgical annotations can be divided into intra- and inter-relations, both of which help autonomous systems comprehend surgical workflows. Intra-relations describe the relevance of various categories within a particular annotation type, and inter-relations describe the relevance between different annotation types. This paper aims to systematically investigate the importance of relational cues in surgery. First, we contribute the RLLS12M dataset, a large-scale collection of robotic left lateral sectionectomy (RLLS) procedures, by curating 50 videos of 50 patients operated on by 5 surgeons and annotating a hierarchical workflow, which consists of 3 inter- and 6 intra-relations, 6 steps, 15 tasks, and 38 activities represented as triplets of 11 instruments, 8 actions, and 16 objects, totaling 2,113,510 video frames and 12,681,060 annotation entities. Correspondingly, we propose a multi-relation purification hybrid network (MURPHY), which incorporates novel relation modules to augment the feature representation by purifying relational features using the intra- and inter-relations embodied in annotations. The intra-relation module leverages an R-GCN to embed visual features in different graph relations, which are aggregated using a targeted relation purification with affinity information measuring label consistency and feature similarity. The inter-relation module is motivated by attention mechanisms to regularize the influence of relational features based on the hierarchy of annotation types from the domain knowledge. Extensive experimental results on the curated RLLS12M dataset confirm the effectiveness of our approach, demonstrating that relations matter in surgical workflow analysis.
translated by Google Translate
Accurate polyp segmentation is of great importance for colorectal cancer diagnosis and treatment. However, due to the high cost of producing accurate mask annotations, existing polyp segmentation methods suffer from severe data shortage and impaired model generalization. By contrast, coarse polyp bounding box annotations are more accessible. Thus, in this paper, we propose a boosted BoxPolyp model to make full use of both accurate mask annotations and extra coarse box annotations. In practice, box annotations are used to alleviate the over-fitting of previous polyp segmentation models, with fine-grained polyp areas generated through an iteratively boosted segmentation model. To achieve this goal, a fusion filter sampling (FFS) module is first proposed to generate pixel-wise pseudo labels with less noise from box annotations, leading to significant performance improvements. In addition, considering the appearance consistency of the same polyp, an image consistency (IC) loss is designed. The IC loss explicitly narrows the distance between features extracted by two different networks, which improves the robustness of the model. Note that BoxPolyp is a plug-and-play model that can be merged into any suitable backbone. Quantitative and qualitative experimental results on five challenging benchmarks confirm that our proposed model outperforms previous state-of-the-art methods by a large margin.
While deep learning methods have so far achieved considerable success in medical image segmentation, they are still hampered by two limitations: (i) reliance on large-scale well-labeled datasets, which are difficult to curate due to the expert-driven and time-consuming nature of pixel-level annotation in clinical practice, and (ii) failure to generalize from one domain to another, especially when the target domain is a different modality with severe domain shifts. Recent unsupervised domain adaptation (UDA) techniques leverage abundant labeled source data together with unlabeled target data to reduce the domain gap, but these methods degrade significantly with limited source annotations. In this study, we address this underexplored UDA problem, investigating a challenging but valuable realistic scenario in which the source domain not only exhibits domain shift w.r.t. the target domain but also suffers from label scarcity. To this end, we propose a novel and generic framework called "Label-Efficient Unsupervised Domain Adaptation" (LE-UDA). In LE-UDA, we construct self-ensembling consistency for knowledge transfer between both domains, as well as a self-ensembling adversarial learning module to achieve better feature alignment for UDA. To assess the effectiveness of our method, we conduct extensive experiments on two different tasks for cross-modality segmentation between MRI and CT images. Experimental results demonstrate that the proposed LE-UDA can efficiently leverage limited source labels to improve cross-domain segmentation performance, outperforming state-of-the-art UDA approaches in the literature. Code is available at: https://github.com/jacobzhaoziyuan/LE-UDA.
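Self-ensembling is commonly realized as a mean-teacher scheme, in which a teacher model tracks an exponential moving average (EMA) of the student's weights and a consistency loss penalizes disagreement between their predictions on the same input. The abstract does not spell out LE-UDA's exact formulation, so the following is only a minimal sketch under that assumption; `ema_update`, `consistency_loss`, and the decay value are hypothetical names and settings, not taken from the paper or its repository.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights as an EMA of the student weights.

    Assumed mean-teacher update: teacher <- alpha * teacher + (1 - alpha) * student.
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between two prediction vectors."""
    if len(student_probs) != len(teacher_probs):
        raise ValueError("prediction vectors must have the same length")
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


# Toy example: flat weight lists stand in for real network parameters.
teacher = [0.0, 1.0]
student = [1.0, 1.0]
teacher = ema_update(teacher, student, alpha=0.9)  # teacher moves 10% toward student
loss = consistency_loss([0.8, 0.2], [0.6, 0.4])    # disagreement on one sample
```

In a real training loop the consistency term would be added to the supervised segmentation loss, and only the student would receive gradient updates.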
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible, and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of the medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations, and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by, and receiving contributions from, research, clinical, and industrial teams around the world, who are pursuing applications spanning nearly every aspect of healthcare.
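Fairness, a criterion that focuses on evaluating algorithm performance across different demographic groups, has gained attention in natural language processing, recommendation systems, and facial recognition. Since medical image samples carry many demographic attributes, it is important to understand the concepts of fairness, be acquainted with unfairness mitigation techniques, evaluate the degree of fairness of an algorithm, and recognize the challenges of fairness issues in medical image analysis (MedIA). In this paper, we first give a comprehensive and precise definition of fairness, followed by an introduction to the techniques currently used for fairness issues in MedIA. After that, we list public medical image datasets that contain demographic attributes, to facilitate fairness research, and summarize current algorithms concerning fairness in MedIA. To help achieve a better understanding of fairness and to call attention to fairness-related issues in MedIA, we conduct experiments that compare the difference between fairness and data imbalance, verify the existence of unfairness in various MedIA tasks (especially classification, segmentation, and detection), and evaluate the effectiveness of unfairness mitigation algorithms. Finally, we conclude with the opportunities and challenges of fairness in MedIA.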
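One simple way to quantify unfairness of this kind is to compare a model's accuracy across demographic groups. The sketch below is illustrative only: the function names `group_accuracies` and `accuracy_gap`, the toy labels, and the choice of the max-minus-min accuracy as the metric are assumptions, not the survey's own definition.

```python
from collections import defaultdict


def group_accuracies(y_true, y_pred, groups):
    """Compute classification accuracy separately for each demographic group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(y_true, y_pred, groups):
    """Largest difference in accuracy between any two groups (0 means parity)."""
    accs = group_accuracies(y_true, y_pred, groups)
    return max(accs.values()) - min(accs.values())


# Hypothetical toy data: two groups of equal size.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

accs = group_accuracies(y_true, y_pred, groups)
gap = accuracy_gap(y_true, y_pred, groups)
```

Both groups have the same number of samples in this toy data, so a performance gap can appear even without data imbalance.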
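Fluoroscopy is an imaging technique that uses X-rays to obtain a real-time 2D video of the interior of a 3D object, helping surgeons observe pathological structures and tissue functions, especially during interventions. However, it suffers from heavy noise that mainly arises from the clinical use of low-dose X-rays, which makes fluoroscopy denoising necessary. Such denoising is challenged by the relative motion between the object being imaged and the X-ray imaging system. We tackle this challenge by proposing a self-supervised, three-stage framework that exploits the domain knowledge of fluoroscopy imaging. (i) Stabilize: we first construct a dynamic panorama based on optical flow calculation to stabilize the non-stationary background induced by the motion of the X-ray detector. (ii) Decompose: we then propose a novel mask-based robust principal component analysis (RPCA) decomposition method to separate a video with detector motion into a low-rank background and a sparse foreground. Such a decomposition accommodates the reading habits of experts. (iii) Denoise: we finally denoise the background and foreground separately with a self-supervised learning strategy and fuse the denoised parts into the final output via a bilateral spatiotemporal filter. To assess the effectiveness of our work, we curate a dedicated fluoroscopy dataset of 27 videos (1,568 frames) with corresponding ground truth. Our experiments demonstrate that the framework achieves significant improvements in denoising and enhancement compared with standard approaches. Finally, expert ratings confirm this efficacy.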
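To make the background/foreground idea in the decomposition stage concrete, the sketch below splits a tiny video into a static background and a sparse residual. It does not implement RPCA or the paper's mask-based variant: it swaps in a per-pixel temporal median as a crude rank-1 background estimate, and the function name `median_background_split`, the threshold, and the toy frames are all hypothetical.

```python
from statistics import median


def median_background_split(frames, threshold=0.5):
    """Split a video into a static background and a sparse foreground.

    frames: list of frames, each a flat list of pixel intensities.
    The background is the per-pixel temporal median (a stand-in for the
    low-rank component); the foreground keeps only residuals whose
    magnitude exceeds `threshold` (a stand-in for the sparse component).
    """
    n_pixels = len(frames[0])
    background = [median(f[i] for f in frames) for i in range(n_pixels)]
    foreground = []
    for f in frames:
        residual = [v - b for v, b in zip(f, background)]
        foreground.append([r if abs(r) > threshold else 0.0 for r in residual])
    return background, foreground


# Three 3-pixel frames; a bright object appears at pixel 1 in the last frame.
frames = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 3.0],
    [1.0, 9.0, 3.0],
]
background, foreground = median_background_split(frames)
```

A true RPCA solver would instead minimize a nuclear-norm plus L1 objective, which tolerates slowly varying backgrounds that a plain median cannot represent.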
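Reconstructing cone beam computed tomography (CBCT) of the lung under respiratory motion is a long-standing challenge. This work goes a step further and addresses a challenging setting: reconstructing multi-phase lung images from just a single 3D CBCT acquisition. To this end, we introduce Respiratory-Gated Synthesis of Views, or REGAS. REGAS proposes a self-supervised method to synthesize the undersampled tomographic views and mitigate aliasing artifacts in the reconstructed images. This method allows a much better estimation of the between-phase deformation vector fields (DVFs), which are used to enhance the reconstruction quality from direct observations without synthesis. To address the large memory cost of deep neural networks on high-resolution 4D data, REGAS introduces a novel Ray Path Transformation (RPT) that allows for distributed, differentiable forward projections. REGAS requires no additional measurements such as prior scans, air-flow volume, or breathing velocity. Our extensive experiments show that REGAS significantly outperforms comparable methods in quantitative metrics and visual quality.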
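Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. To inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.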
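In clinics, a radiology report is crucial for guiding a patient's treatment. Unfortunately, report writing imposes a heavy burden on radiologists. To effectively reduce this burden, we present an automatic, multi-modal approach for report generation from chest X-rays. Our approach, motivated by the observation that the descriptions in radiology reports are highly correlated with the X-ray images, features two distinct modules: (i) Learned knowledge base. To absorb the knowledge embedded in the above-mentioned correlation, we automatically build a knowledge base based on textual embeddings. (ii) Multi-modal alignment. To promote the semantic alignment among reports, disease labels, and images, we explicitly utilize textual embeddings to guide the learning of the visual feature space. We evaluate the performance of the proposed model using metrics from both natural language generation and clinical efficacy on the public IU and MIMIC-CXR datasets. Our ablation study shows that each module contributes to improving the quality of the generated reports. Furthermore, with the aid of both modules, our approach clearly outperforms state-of-the-art methods.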
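The success of deep learning methods relies on the availability of well-labeled large-scale datasets. However, for medical images, annotating such abundant training data often requires experienced radiologists and consumes their limited time. Few-shot learning has been developed to alleviate this burden, achieving competitive performance with only a few labeled samples. However, a crucial problem in few-shot learning is the selection of template images for annotation before learning, which affects the final performance. In this paper, we propose a novel Sample Choosing Policy (SCP) to select the "most worthy" images for annotation in the context of few-shot medical landmark detection. SCP consists of three parts: 1) self-supervised training, which builds a pre-trained deep model to extract features from radiological images; 2) key point proposal, which localizes informative patches; and 3) representative score estimation, which searches for the most representative samples or templates. The advantage of SCP is demonstrated by various experiments on three widely used public datasets. For one-shot medical landmark detection, its use reduces the mean radial error on the Cephalometric and HandXray datasets by 14.2% (from 3.595 mm to 3.083 mm) and 35.5% (from 4.114 mm to 2.653 mm), respectively.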
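The reported percentages follow directly from the before/after mean radial errors, which can be checked with a short calculation. In the sketch below, `mean_radial_error` and `relative_reduction` are illustrative helper names (not from the paper), the toy landmark coordinates are made up, and coordinates are assumed to already be in millimetres.

```python
import math


def mean_radial_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth landmarks."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def relative_reduction(before, after):
    """Percentage reduction when an error drops from `before` to `after`."""
    return 100.0 * (before - after) / before


# Toy landmarks: one exact hit and one prediction 5 mm away.
mre = mean_radial_error([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])

# Figures quoted in the abstract.
ceph = relative_reduction(3.595, 3.083)
hand = relative_reduction(4.114, 2.653)
print(f"{ceph:.1f}% {hand:.1f}%")  # prints "14.2% 35.5%"
```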